Abstract

E. coli dihydrofolate reductase (DHFR) catalyzes the reduction of dihydrofolate to tetrahydrofolate.
During the catalytic cycle, DHFR undergoes conformational transitions between the closed (CS) and
occluded (OS) states which, respectively, describe whether the active site is closed or occluded by
the Met20 loop. The CS→OS and the reverse transition may be viewed as allosteric transitions.
Using a sequence-based approach we identify a network of residues that represents the allostery wiring
diagram. Many of the residues in the allostery wiring diagram, which are dispersed throughout the
adenosine binding domain as well as the loop domain, are not conserved. Several of the residues in the
network have been previously shown by NMR experiments, mutational studies, and molecular dynamics
simulations to be linked to equilibrium conformational fluctuations of DHFR. To further probe the
nature of events that occur during conformational fluctuations we use a self-organized polymer model
to monitor the kinetics of the CS→OS and the reverse transitions. During the CS→OS transition,
coordinated changes in a number of residues in the loop domain enable the Met20 loop to slide along
the α-helix in the adenosine binding domain. Sliding is triggered by pulling of the Met20 loop by
the βG-βH loop and pushing action of the βG-βH loop. The residues that facilitate the Met20 loop
motion are part of the network of residues that transmit allosteric signals during the CS→OS transition.
Replacement of M16 and G121, whose Cα atoms are about 4.3 Å apart in the CS, by a disulfide crosslink
impedes the CS→OS transition. The order of events in the OS→CS transition is not the reverse of
the forward transition. The contact Glu17-Ser49 in the OS state persists until the sliding of the Met20
loop is nearly completed. The ensembles of transition state (TS) structures in both allosteric
transitions are heterogeneous. The most probable TS structure resembles the OS (CS) in the CS→OS
(OS→CS) transition, which is in accord with the Hammond postulate. Structures resembling the OS
(CS) are present as a minor (≈1-3%) component in equilibrated CS (OS) structures.

Introduction



Conformational fluctuations of proteins have been argued to play a central role in enzyme
catalysis [4]. Such a concept is appealing because the energy landscape of enzymes
even in the folded state is rugged [5], and hence thermal energy might be sufficient to access
several conformational substates during a typical reaction cycle. In recent years, results
from a number of studies have been used to propose that dynamic motions in a network of
[…] relevant structural transitions may be encoded in the protein […] [16]. While it is difficult
to unambiguously demonstrate whether collective dynamics involving a network of residues
facilitates catalysis [12], it is clear that enzymes sample a number of distinct states during a
reaction cycle. In the best studied example of E. coli dihydrofolate reductase (DHFR) the role of
the conformational motions in the enzyme in facilitating the hydride transfer has been investigated
using mutational studies, NMR relaxation dispersion measurements [22] that probe the dynamics on
the μs to ms time scale, molecular dynamics simulations [13], and sequence analysis [6].



The emphasis on correlated motions in enzyme catalysis has been repeatedly questioned
by Warshel and coworkers [17] who have shown that catalytic rates are largely affected by
changes in the free energy barriers ($\Delta g^{\ddagger}$) in the chemical reaction step [26]. Thus,
mutational effects or other constraints simply alter $\Delta g^{\ddagger}$ and hence the catalytic rates. Complex
enzyme motion in a multidimensional free energy landscape is to a large extent orthogonal
to the dynamics along the optimized reaction coordinate. Regardless of the role of correlated
dynamical motions in the rates of hydride transfer in DHFR, it is known that the enzyme cycles
through a number of states during the catalytic cycle. The dynamics of transitions between such
states (referred to as allosteric states) is the topic of interest in this study. Whether the time
scales of such conformational transitions are linked to catalysis is unclear.

DHFR catalyzes the reduction of 7,8-dihydrofolate (DHF) to 5,6,7,8-tetrahydrofolate (THF)
[1]. Upon binding of the cofactor, nicotinamide adenine dinucleotide phosphate (NADPH), hydride
transfer from NADPH to protonated DHF leads to production of NADP+ and THF [27]. DHFR,
which is required for normal folate metabolism, plays an important role in cell growth and
proliferation in prokaryotes and eukaryotes [28]. As a result of its
obvious clinical importance, it has been studied extensively using a wide range of experimental
and theoretical methods [1].

High resolution crystal structures show that the E. coli DHFR enzyme has eight β-strands
and four α-helices interspersed with flexible loops that connect the secondary structural elements
[29, 30]. The structure of DHFR can be partitioned into adenosine binding and loop subdomains
[30]. In the catalytic cycle, the Met20 loop changes conformation between closed (CS) and occluded
(OS) states (Fig. 1). Interactions through a hydrogen bond network with the βF-βG loop (residues
117-131) stabilize the CS [21].

The crystal structures of E. coli DHFR complexes in the catalytic cycle have given a detailed
map of the structural changes that occur in the enzyme [30]. In addition, the conformational
changes in E. coli DHFR in response to binding have been inferred using various experimental
techniques, including X-ray crystallography, fluorescence, and nuclear magnetic resonance (NMR)
[29, 30]. Comparison of the CS and OS structures shows that the conformation of the Met20
loop undergoes the largest change during the reaction cycle. As a result, the states of DHFR are
classified using the conformations of the Met20 loop. The active site is either closed or occluded
depending on the conformation of the Met20 loop (Fig. 1). Thus, the motion of the Met20 loop
coordinates the dynamical changes in DHFR during the different stages of the catalytic cycle.

Although the structures of the CS and OS states are known, the dynamic pathways connecting
the two allosteric states have not been characterized [1, 2]. In this paper, we address the following
questions: (a) Can the evolutionary footprints in the DHFR family of sequences be used to
obtain a network of residues in DHFR that is linked to the CS→OS and the reverse transition
in the enzyme? If so, what role do these residues play in the kinetics of the CS→OS and the reverse
transitions? (b) What are the pathways and the nature of the kinetics associated with the transition
from OS to CS and back? (c) What are the structures of the transition state ensemble in the
OS→CS transition and in the reverse reaction? We use a combination of bioinformatics methods
[32, 33] and Brownian dynamics simulations of coarse-grained models of DHFR to address these
questions. It should be emphasized that our study focuses only on the kinetics of the CS→OS and
OS→CS transitions, and not on whether the motions that drive these transitions affect hydride
transfer reactions. The precise linkage between equilibrium or dynamic motions of proteins and
catalysis continues to be a topic of debate.



In order to determine the network of residues in DHFR that regulates the allosteric transitions
we adopt a sequence-based method [33], which is based on the Statistical Coupling Analysis
(SCA). The SCA identifies many residues that are dispersed between the two
subdomains as being relevant in the function of DHFR. Although several of these residues
are not strongly conserved, they are predicted to covary across the DHFR family. In order to
probe allostery in DHFR, we carried out simulations using the coarse-grained self-organized polymer
(SOP) model. The Brownian dynamics simulations reveal the dynamical changes that occur
during the CS→OS and OS→CS transitions. The conformational changes in the Met20 loop,
which occur by a sliding motion along a helix in the adenosine binding domain, are preceded
by coordinated rupture of interactions between the Met20 and βF-βG loops and the formation
of contacts between the Met20 and βG-βH loops. Simulations in which Met16 and Gly121 are
crosslinked by a disulfide bond show that the CS→OS transition is dramatically affected. In
accord with recent NMR experiments, we find that a small fraction (≈1-3%) of OS (CS)
structures is populated by thermal fluctuations when DHFR is in the CS (OS) state. The
transition state ensemble (TSE) is broad in both the forward and reverse
directions. The presence of a broad TSE and a small barrier separating the CS and OS states
supports the conformational selection model, which posits that, owing to the heterogeneous nature
of fluctuations, conformations resembling the OS state are present in the CS and vice versa.

Results and Discussion 

Allostery wiring diagram shows that key residues are dispersed throughout the 
structure 



We obtained 526 sequences for the DHFR family from Pfam [40] (entry 00186), and realigned
them using the ClustalW package [41]. We manually deleted certain sequences, and generated a
multiple sequence alignment (MSA) that contained 462 sequences. Each of the 462 aligned sequences
has 323 positions including gaps. With the fraction of the sequences in the subalignment set to
f = 0.35 (see Methods) in the SCA, there are 74 allowed perturbations (j = 1, 2, 3, ..., 74)
at the various positions in the DHFR family. We used the clustering protocol [42, 43] to
identify the set of co-varying residues. After rescaling the $\Delta\Delta G_{ij}$ matrix (see Methods)
(i = 1, 2, 3, ..., 158 and j = 1, 2, 3, ..., 74), and using the Euclidean similarity measure in the
coupled two-way clustering algorithm [33], we obtained a cluster of 21 residues and a cluster of
19 perturbations. As in our previous work [33], we propose that the residues that are clustered
both in positions and perturbations constitute the minimal robust network of residues that signal
the kinetics of the CS→OS transition and back. The relevant network of spatially separated
residues constitutes an allostery wiring diagram (Fig. 2A), and may encode for the promoting
motions.

To determine if the residues in the network predicted by SCA merely reflect sequence
conservation we calculated the sequence entropy $S_i = -\sum_{x=1}^{20} p_i^x \ln(p_i^x)$. For a perfectly conserved
residue, $S_i = 0$. If we assume that a residue is strongly conserved if $S_i < 0.1$, then there are very
few residues with high sequence conservation. These are G15, P21, W22, T35, G43, L54, R57,
G95, and G96 (Fig. 2B). Sequence entropy is too restrictive in assessing the nature of mutations
that are tolerated at a given position. The allowed variations in the amino acid substitution are
better captured using the chemical sequence entropy $S_{CSE} = -\sum_{x=1}^{4} p_i^x \ln(p_i^x)$, where the
twenty amino acids are divided into four classes, namely, hydrophobic (H), polar (P), positively
charged (+), and negatively charged (-) [44]. Using chemical sequence entropy, the residues
I14, G15, M20, P21, W22, D27, F31, T35, V40, I41, M42, G43, T46, W47, S49, I50, G51, L54,
R57, I60, I61, L62, S63, I91, M92, V93, G95, G96, V99, Y100, L110, T113, I115, and F125 are
strongly ($S_{CSE} < 0.1$) conserved.
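
The two entropies can be illustrated for a single alignment column with a short sketch. It is not the code used in this work: the function names are hypothetical, gaps are simply skipped, and the assignment of the twenty amino acids to the four classes is an assumed grouping (the grouping actually used follows [44] and may differ in detail).

```python
import math
from collections import Counter

# ASSUMPTION: illustrative four-class grouping (H, P, +, -); the grouping
# used in the paper follows its reference [44] and may differ in detail.
CHEMICAL_CLASS = {}
for residues, label in (("AVLIMFWCGP", "H"), ("STNQYH", "P"), ("KR", "+"), ("DE", "-")):
    for aa in residues:
        CHEMICAL_CLASS[aa] = label


def entropy(symbols):
    """Shannon entropy -sum_x p_x ln(p_x) over the observed symbol frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def sequence_entropy(column):
    """S_i for one alignment column; gaps and unknown symbols are skipped."""
    return entropy([aa for aa in column if aa in CHEMICAL_CLASS])


def chemical_sequence_entropy(column):
    """S_CSE for one alignment column, computed over the four chemical classes."""
    return entropy([CHEMICAL_CLASS[aa] for aa in column if aa in CHEMICAL_CLASS])


if __name__ == "__main__":
    column = "IIVLIMIV-I"  # toy column: variable in identity, uniform in chemistry
    print("S_i   =", round(sequence_entropy(column), 3))
    print("S_CSE =", round(chemical_sequence_entropy(column), 3))
```

A column such as the toy one above is not conserved by the $S_i < 0.1$ criterion yet is fully conserved in chemical character, which is the distinction drawn in the text.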

It is not surprising that many of the residues identified in the allostery wiring diagram (Fig.
2) are also strongly conserved as they are associated directly with the binding surface that
stabilizes the closed conformation. The SCA also identifies residues N18, L28, K38, V72, S77,
A84, G97, and Q108 that are neither highly conserved nor adjacent to conserved residues. It
appears that many of the residues in the network are relevant for executing dynamical motions
that drive the allosteric transitions in DHFR or for cofactor binding. For example, N18 forms
a contact with H124 in the CS. Similarly, during the OS→CS transition, L28 comes close to I50
upon binding of various ligands, which results in the closure of the active site cleft [45]. Residue
K38, which is in the hinge region, facilitates rotation of the adenosine binding domain towards
the loop domain (residues M1-D37 and A107-R159).

The key residues in the allostery wiring diagram have been shown in previous theoretical and
experimental studies to be important either in catalysis or in binding of cofactors. Benkovic and
coworkers showed that mutations of residues (M42 and G121) that are far from the active site
affect the hydride transfer rates [18, 46]. Based on equilibrium covariance matrix fluctuations
of the Cα atoms obtained from all-atom MD simulations, Rod et al. showed that interactions of
M42 with other residues (H45, D28, S49) would also be involved in the CS→OS conformational
transition [13]. Positions M42 and/or G121, mutations of which lead to anti-correlated motions
between the two subdomains, are found to be part of the predicted allostery wiring diagram.
Hammes-Schiffer used sequence conservation of a small dataset of DHFR sequences to identify
a network of residues whose coordinated motion is apparently linked to catalysis [7]. Among
them, I14 is found to be in the allostery wiring diagram that we have identified using SCA.
It was also found that motions of residues W22, D27, M42, I60, L62, and T113, which form a
hydrogen bond network with DHF in the active site, might also be involved in coupled promoting
motions. Taken together, the present and previous studies show that the allostery wiring
diagram, which represents the network of signaling residues in DHFR, is spread throughout the
structure (Fig. 2). More importantly, many non-conserved residues are part of the network.

Motions between the two subdomains in the CS and OS states are anticorrelated

The Root Mean Square Deviation (RMSD) between the closed and occluded crystal structures
of E. coli DHFR is only 1.18 Å. However, the RMSD of the active Met20 loop (residues A9-L24)
between the two end point structures is almost three times larger (≈ 3.35 Å). In order to
assess the differences in the structures of the two states at finite temperature, we equilibrated
the OS and CS conformations at 300 K. Comparison of the thermally averaged contact maps
shows that the closed state differs from the occluded conformation mainly in the Met20 loop
and the secondary structural elements that are affected by the motions (see below) of the Met20
loop (data not shown). The largest changes occur in the βF-βG (D116-D132) and βG-βH
(D142-S150) loops, and the α-helix H2 of the adenosine binding domain (residues R44-I50).
The crystal structures and the thermally equilibrated CS and OS states also show that, in the
CS→OS transition, the conformational fluctuations in the Met20 loop have to be accompanied by
the following changes: (1) contacts between the Met20 and βF-βG loops should be ruptured;
(2) interactions between helix 2 and the Met20 loop should be disrupted, and reform in a
different location; (3) stabilizing contacts between the Met20 and βG-βH loops should form. If
these processes are disrupted then it is likely that the catalytic efficiency of DHFR may be
compromised. Indeed, experimental findings of the importance of mutating M42, G121, S148
or any two of these residues on the hydride transfer rates can be rationalized based solely on a
static picture [13].



The correlated motions in DHFR are computed using the time-averaged covariance matrix defined
as

$$\langle C_{ij}(X) \rangle = \frac{1}{T_{obs}} \int_0^{T_{obs}} \Delta\hat{\mathbf{r}}_i(t) \cdot \Delta\hat{\mathbf{r}}_j(t)\, dt \qquad (1)$$

where $\Delta\hat{\mathbf{r}}_i(t) = (\mathbf{r}_i(t) - \mathbf{r}_i(0))/|\mathbf{r}_i(t) - \mathbf{r}_i(0)|$ is the unit vector of the displacement of the i-th Cα
atom with respect to its initial value, and X is either CS or OS. The direction of motion of
the i-th residue is given by $\Delta\hat{\mathbf{r}}_i$. If $\langle C_{ij} \rangle$ is positive then the motions of the two residues i and
j are correlated while negative values correspond to anti-correlation. For perfectly correlated
(anti-correlated) residues $\langle C_{ij} \rangle$ is +1 (-1). The covariance map for the CS shows anti-correlated
motion between the Met20 loop and the adenosine binding domain, as well as between the βF-βG
and βG-βH loops (see the dark blue regions in Fig. 3). Thus, the adenosine binding domain
and the loop domains move in an anti-correlated manner. Similar conclusions were obtained
using all-atom simulations of the WT DHFR [13]. The present simulations and previous MD
studies [9] point to the importance of correlated motions between regions that
are spatially well separated. The cross correlations in the inter-domain motions shown in Fig.
3 are obtained by averaging the structural fluctuations over 0.1 ms, and may well be relevant
in facilitating cofactor binding and solvent rearrangement needed for catalysis [26]. The static
picture alone is not sufficient to describe the kinetics of transitions between the CS and OS.
Only by probing the kinetics of conformational fluctuations in the CS→OS (and the backward)
transitions can we predict the order of events that results in the conformational changes in the
all-important Met20 loop.

The bottom panel in Fig. 3 shows the differences in the covariance matrices $\Delta C_{ij} =
\langle C_{ij}(CS) \rangle - \langle C_{ij}(OS) \rangle$ in regions that are significantly different between the two allosteric
states. The red region between the Met20 loop and the adenosine binding domain indicates
more anti-correlated motions in OS than in CS. The blue region between the Met20 loop and
the other two loops shows less anti-correlated motions and more correlated motions in OS than
in CS.
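
As an illustration of how such a covariance map can be evaluated from stored coordinates, the following minimal sketch replaces the time integral by an average over discrete frames. The function names and the toy trajectory are hypothetical, and the first frame is assumed to play the role of the initial structure; this is not the analysis code used for Fig. 3.

```python
import math


def unit_displacement(r_t, r_0):
    """Unit vector of the displacement of one bead relative to its initial position."""
    d = [a - b for a, b in zip(r_t, r_0)]
    norm = math.sqrt(sum(x * x for x in d))
    # A bead that has not moved has no defined direction; use the zero vector.
    return [x / norm for x in d] if norm > 0.0 else [0.0, 0.0, 0.0]


def covariance_matrix(trajectory):
    """Frame-averaged dot products of unit displacement vectors.

    trajectory[t][i] is the (x, y, z) position of bead i in frame t.
    ASSUMPTION: frame 0 is the reference (initial) structure and the time
    integral is approximated by a plain average over the remaining frames.
    """
    reference, frames = trajectory[0], trajectory[1:]
    if not frames:
        raise ValueError("need at least one frame after the reference")
    n = len(reference)
    cov = [[0.0] * n for _ in range(n)]
    for frame in frames:
        units = [unit_displacement(frame[i], reference[i]) for i in range(n)]
        for i in range(n):
            for j in range(n):
                cov[i][j] += sum(a * b for a, b in zip(units[i], units[j]))
    return [[value / len(frames) for value in row] for row in cov]


if __name__ == "__main__":
    # Toy trajectory: beads 0 and 1 move together along +x, bead 2 along -x.
    toy = [
        [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)],
        [(1.0, 0.0, 0.0), (6.0, 0.0, 0.0), (-1.0, 5.0, 0.0)],
        [(2.0, 0.0, 0.0), (7.0, 0.0, 0.0), (-2.0, 5.0, 0.0)],
    ]
    for row in covariance_matrix(toy):
        print(row)
```

Subtracting the matrices obtained for the CS and OS trajectories element by element gives the difference map discussed in the text.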

Kinetics of the CS→OS transition involves deformation of the Met20 loop

In order to dissect the kinetics of structural changes in the Met20 loop during the forward
(CS→OS) and the backward (OS→CS) directions we have performed a number of simulations
using the procedures described in the Methods section. Although it is clear that the Met20 loop
plays an essential role in this transition, the order of events that drives its conformational change is
not known. We monitor the Met20 loop kinetics using two surrogate reaction coordinates.
One is the global RMSD, $\Delta_G$, that is obtained by aligning the instantaneous conformation
of DHFR at time t either with the CS or the OS structure. During the CS→OS transition,
$\Delta_G$ should increase with respect to the CS and decrease with respect to the OS. From the time
dependent variations in $\Delta_G$ we can infer the changes in the Met20 loop with respect to the
entire structure. To determine the changes that are localized in the Met20 loop we calculated a
local RMSD, $\Delta_L$, which uses only the coordinates of the active loop. From the time dependent
changes in $\Delta_L$, which is computed by aligning the Met20 loop and computing its RMSD (with
respect to the starting conformation) during the two transitions, we can explicitly identify the
dominant motions (translation, rotation, or twist) of the loop. The kinetics expressed in terms
of the local coordinate $\Delta_L$ yields the conformational changes of only the Met20 loop.

The time dependent changes in the global RMSD, $\Delta_G(t)$, with respect to the CS show
considerable dynamical heterogeneity (Fig. 4). The bottom panel in Fig. 4, for one trajectory,
shows that there are multiple recrossings across the transition region, which is suggestive of a
rather broad transition region (see below). Prior to the CS→OS transition (t < 80 μs in Fig.
4) $\Delta_G(t)$ undergoes substantial fluctuations, which suggests that high energy states are being
sampled while DHFR is in the CS. More importantly, such fluctuations can lead to infrequent
visits to conformations that are similar to the OS state (see below). The broad distribution
of transition times and multiple recrossings attest to the plasticity of the enzyme during the
conformational transition.

Although there is great diversity in the dynamics of the individual trajectories, $\langle \Delta_G(t) \rangle$
and $\langle \Delta_L(t) \rangle$, obtained by averaging over an ensemble of initial conformations, can be
approximately described using a two-state model (Fig. 5A). Comparison of $\Delta_G(t)$ and $\Delta_L(t)$ shows
(Fig. 5A) that deformation of the Met20 loop occurs after the global motions in the CS→OS
transition. Because the long time values of $\langle \Delta_L \rangle$ are less than the $\langle \Delta_G \rangle$ values, we surmise
that the structural changes in the Met20 loop involve translation and rotational motion towards
the OS structure.

Sliding of the Met20 loop across α2 is the rate limiting step in the CS→OS
transition

In order to understand the mechanism of the communication during the CS→OS transition
we monitored the local movements of the Met20 loop and the helix α2 in the adenosine binding
domain. Rupture of the contacts in the CS state (Asn18-His45, Asn18-Ser49, and Ala19-Ser49)
and formation of Glu17-Ser49 during the CS→OS transition facilitates the sliding of Met20 along
α2 (Fig. 6A). The relative sliding motion between α2 and the Met20 loop enables NADPH to
move closer to DHF.

In the loop subdomain, the flexible Met20 loop interacts simultaneously with both the βF-βG
and βG-βH loops [1]. In order to dissect the order of events that occurs in the CS→OS transition
we have computed the kinetics of breakage and formation of a number of contacts involving the
two loops (Fig. 6A-C). By fitting the time-dependent changes in the formation and rupture
of contacts to single exponential kinetics we find that the rupture of contacts between the Met20
loop and the βF-βG loop in the CS as well as the formation of contacts between residues in the Met20 loop
and the βG-βH loop occur nearly simultaneously (on the μs time scale) (see Fig. 6A-C). Only
subsequently (on a time scale of about 2 μs) does the interaction between Glu17 (in the Met20 loop)
and Ser49 (in α2), which exists only in the OS state, take place. Thus, the sliding of the Met20 loop on
α2 requires coordinated motion of a number of residues in the loop domain.
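
A single-exponential time constant of the kind quoted above can be extracted with a simple fit. The sketch below is only illustrative and makes several assumptions: the decaying fraction of intact contacts is modeled as $Q(t) = e^{-t/\tau}$, the fit is a least-squares line through the origin in $\ln Q$ versus t, and the function name and toy data are hypothetical. The fitting procedure actually used for Fig. 6 is not specified here and may differ.

```python
import math


def fit_single_exponential(times, fractions):
    """Return tau from a least-squares fit of ln Q(t) = -t / tau through the origin.

    ASSUMPTION: Q(t) is the ensemble-averaged fraction of trajectories in which
    a contact is still intact at time t; points with Q <= 0 are skipped because
    their logarithm is undefined.
    """
    pairs = [(t, math.log(q)) for t, q in zip(times, fractions) if q > 0.0]
    slope = sum(t * y for t, y in pairs) / sum(t * t for t, _ in pairs)
    return -1.0 / slope


if __name__ == "__main__":
    # Toy data (arbitrary time units) generated with tau = 2.0 for illustration.
    times = [0.5, 1.0, 2.0, 4.0, 6.0]
    intact = [math.exp(-t / 2.0) for t in times]
    print("fitted tau:", fit_single_exponential(times, intact))
```

For a contact that forms rather than ruptures, the same fit can be applied to 1 - Q(t); comparing the fitted time constants of different contacts then orders the events.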

We can further dissect the nature of the sliding motion of the Met20 loop along α2 by
simultaneously measuring the changes in the angles and the distances between selected residues.
We have computed the time-dependent changes in the distances between Asn18-His45 ($R_1$),
Ala19-Ser49 ($R_2$), Asn18-Met42 ($R_3$) and Glu17-Ser49 ($R_4$), respectively. The sliding motion is
vividly illustrated using the changes in the angles that the vectors $\mathbf{R}_1$, $\mathbf{R}_2$, $\mathbf{R}_3$ and $\mathbf{R}_4$ make with
the axis of α2. The angles are defined as $\alpha_i = \cos^{-1}(\hat{\mathbf{R}}_i \cdot \hat{\mathbf{u}}_{H2})$, i = 1, 2, 3, 4, where $\hat{\mathbf{u}}_{H2}$ is the unit vector
of the α2 helix axis. The two-dimensional projection of $(R_i, \alpha_i)$ (i = 1, 2, 3, 4), which represents
the values that are sampled in the kinetic trajectories, shows that the $\alpha_i$ values decrease
monotonically during the CS→OS transition. The averages over all the trajectories for $\alpha_i$ also
show a monotonic decrease. The averages also show that the $\alpha_i$ values are clustered
around either the CS or the OS state. In other words, there is very little backtracking in the sliding
movement of the Met20 loop along H2. The histogram of the angles and distances sampled
during the transition in Fig. 7 also shows that the fluctuations in $\alpha_i$ (i = 1, 2, 3, 4) are centered
around the OS values, which suggests (in terms of these microscopic variables) that the
transition occurs when the conformation is close to the OS state. This result is also in accord
with the results in Fig. 6A, which shows that Glu17-Ser49 only forms when the interactions in
the CS are ruptured, a process that occurs closer to the completion of the CS→OS transition.
The structural changes that accompany the sliding motion of the Met20 loop involve concerted
motion of a number of residues (see the diagram on the left in Fig. 8). The figures summarize the
collective motions of residues in both the subdomains that facilitate the structural deformations
in the Met20 loop.
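
The pair of microscopic coordinates used above, the distance between two residues and the angle their connecting vector makes with the helix axis, can be computed as in the following sketch. The function name and the coordinates are hypothetical placeholders; in practice the bead positions and the α2 axis would come from the simulated structures.

```python
import math


def distance_and_angle(p_a, p_b, helix_axis):
    """Return (R, alpha): the length of the vector from p_a to p_b and the
    angle in degrees between that vector and the helix axis.

    The axis need not be normalized; both vectors are normalized here
    before taking the arccosine.
    """
    r = [b - a for a, b in zip(p_a, p_b)]
    r_norm = math.sqrt(sum(x * x for x in r))
    u_norm = math.sqrt(sum(x * x for x in helix_axis))
    cos_alpha = sum(x * u for x, u in zip(r, helix_axis)) / (r_norm * u_norm)
    # Clamp to [-1, 1] to guard against rounding slightly outside the domain.
    cos_alpha = max(-1.0, min(1.0, cos_alpha))
    return r_norm, math.degrees(math.acos(cos_alpha))


if __name__ == "__main__":
    # Hypothetical coordinates (in Angstrom) for two beads and a helix axis along z.
    bead_a = (0.0, 0.0, 0.0)
    bead_b = (3.0, 0.0, 3.0)
    axis = (0.0, 0.0, 1.0)
    print(distance_and_angle(bead_a, bead_b, axis))
```

Evaluating this pair for each of the four residue pairs in every saved frame gives the $(R_i, \alpha_i)$ projections described in the text.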

Cysteine crosslink inhibits the CS→OS transition

The kinetics in both the forward and the backward (see below) transitions show that the
coordinated motion in the loop subdomain plays an important role in enabling the Met20 loop
to communicate with the adenosine binding domain. In the crystal structure of the CS, the distance
between the Cα atoms of Met16 and Gly121 is about 4.3 Å. It is possible to mutate these
residues to Cys to establish a disulfide crosslink. We have simulated the kinetics of the CS→OS
transition in the crosslink mutant (referred to as CL) to assess the extent to which the motion of
the Met20 loop is inhibited. Previously, it has been argued that constraining even residues that
are 28 Å apart can affect hydride transfer rates. Our purpose in studying the CL mutant
is to see how the strain in the loop domain would affect the communication between the two
domains. Since the disulfide bond constrains the distance between Met16 and Gly121 to 4.3 Å,
the anti-correlated motion between the Met20 loop and the βF-βG loop should be impeded. The time
dependencies of $\langle \Delta_L(CS|t) \rangle$ and $\langle \Delta_L(OS|t) \rangle$ show that the Met20 loop does not fully
adopt its conformation in the OS state (compare Fig. 5A and B). Similarly, the long time values
of $\langle \Delta_G(CS) \rangle$ and $\langle \Delta_G(OS) \rangle$ in the mutant are different from those in the WT (see Fig. 5B). In
the WT, the βG-βH loop is involved in the coordinated motion between the two domains. Surprisingly,
the crosslink has little effect on the relative motion between the βG-βH loop and the Met20 loop.
The time dependent changes that monitor the formation of contacts between Trp22-His149 and
Asn23-His149 are similar in the WT and the crosslink mutant. Because the interactions between
the Met20 loop and the βG-βH loop are not fully inhibited in the CL, the sliding motion across α2
with the formation of Glu17-Ser49 can occur (Fig. 10A), albeit less efficiently. We predict that
due to the incomplete CS→OS transition the crosslink will dramatically affect the rate of the
forward hydride transfer. Experiments using CL can shed further light on the importance of
enzyme motion in catalysis, which still remains controversial [17].

Deformation of the Met20 loop drives the global motion during the OS→CS
transition

The time constant for the local kinetics of the Met20 loop in the OS→CS transition obtained
from $\langle \Delta_L(t) \rangle$ (Fig. 9) is greater than the time scale in which $\langle \Delta_G(t) \rangle$ changes. This
implies that the Met20 loop gliding across α2 is the first event in the OS→CS transition. In contrast,
during the CS→OS transition, only in the final stages does the Met20 loop occlude the active
site.

Although the initial change in the OS→CS transition involves the deformation of the Met20
loop (Fig. 9), the microscopic events that drive this transition are distinct from those seen in the
CS→OS transition. Remarkably, the rupture of Glu17-Ser49 occurs only after the formation of
the contact between the Met20 loop and α2. The time dependent changes in the contacts present
only in the CS state (Asn18-His45, Asn18-Ser49, and Ala19-Ser49) occur while the Glu17-Ser49 (in
the OS state) contact still persists (Fig. 10B). We suggest that binding of NADPH, which is
required for THF to be released, assists in the formation of contacts between the Met20 loop and
βF-βG, and between the Met20 loop and α2. Only after these contacts are established does the contact
between Glu17 and Ser49 rupture (Fig. 10B). Upon rupture of the Glu17-Ser49 contact, the Met20
loop slides back to its closed conformation, and THF is released.

The simulations also show coordinated motions among the three loops in the loop subdomain
during the OS→CS transition (see the right side of Fig. 8). From the analysis of the
time-dependent changes in the distances between a number of residues we conclude that the βF-βG loop
stretches the Met20 loop by forming a number of contacts (Gly15-His124, Met16-Glu120,
Met16-Gly121, Met16-Asp122, Met16-Thr123, Glu17-Gly121, and Glu17-Asp122) with the Met20 loop.
In concert with these events the strain imposed on the Met20 loop by the βG-βH loop is released
by rupture of contacts (Trp22-Ser148, Trp22-His149, and Asn23-His149) with the Met20 loop.
The pull (by the βF-βG loop) and push (by the βG-βH loop) action on the Met20 loop must
take place before the Met20 loop slides back to its conformation in the CS state (Fig. 10).
These results show that the pathways in the OS→CS transition are not the reverse of what
transpires during the CS→OS transition. The structural changes in the Met20 loop and the
concerted motions of a number of residues that drive these changes are shown on the right side
of Fig. 8.

Residues in the allostery wiring diagram code for ligand binding and dynamics 

The SCA predicts a number of residues that are expected to be relevant either in the motion
of DHFR or in its function (Fig. 2A). Some of the residues in the network are related to cofactor
binding and interaction with the active site while others are directly involved in accommodating
the motion of the Met20 loop during the CS→OS transition. For example, SCA identified
Leu28, Ala29, and Ser63 (Fig. 2), all of which are involved in ligand binding or binding-involved
dynamics. The amino acid at location 29, which in E. coli DHFR is Ala, is in contact with
His28 […] show isomerization between two isoforms of the apoenzyme [1]. In the L. casei enzyme the
conversion between the isoforms occurs only for the folate-bound complex while in human DHFR
there appears to be only one conformation in the methotrexate (MTX)-DHFR complex [1]. The
importance of Ser63 in maintaining a hydrogen bond with NADPH was noted in molecular
dynamics simulations. Similarly, Asp27 is involved in a hydrogen bond network with DHF in the
active site [45]. The network predicted by SCA also contains Ile60 and Leu62, both of which have
been recognized to be dynamically involved in interactions with the Met20 loop. SCA also suggests
that Ile94 and Gly97 should play a role in the function of DHFR. Because SCA cannot assess
the importance of absolutely conserved residues it is likely that the neighboring residues Gly95 and
Gly96 may be relevant in the reaction cycle of DHFR [1, 21]. It is noteworthy that the SCA
identified a network of residues in the helical region α2 in the adenosine binding domain as
being important. The present simulations show that the critical sliding motion of the Met20
loop along α2 completes the allosteric transitions. Mutations in the region (Ile41-His45), which
is far from the active site, have great influence on the forward hydride transfer reaction without
affecting cofactor binding. It appears that the predictions of the SCA can be rationalized
in light of a number of experimental and theoretical studies that have identified the importance
of concerted motions among a sparse network of residues on the reaction cycle of DHFR. The
sequence-based approach fails to identify some key residues (Gly121 being the most important) which
apparently play a role in catalysis [18].

The average transition state structure resembles OS (CS) in the CS→OS 
(OS→CS) transition. 

We have used the global RMSD ($\Delta_G$) as a surrogate reaction coordinate to determine the 
structures of the transition state ensemble (TSE). We assume that the transition state (TS) for 
a molecule is reached for the first time at $t_{TS}$ if $|\Delta_G^{CS}(t_{TS}) - \Delta_G^{OS}(t_{TS})| < \epsilon$ (= 0.5 Å) is satisfied. 
Our criterion places the transition state equidistant (in terms of the global RMSD) from the CS 
and OS. Comparison of the contact maps (data not shown) for the TSE, CS, and OS shows that 
both the transitions exhibit larger changes with respect to the starting state than the ending 
state. The largest changes between the CS and OS states, which take place in the Met20 and 
βF-βG loops, occur before the transition state is reached. 

The heterogeneity observed in the dynamics of the CS→OS transition is also reflected in the 
distribution $P(t_{TS})$ of the transition time $t_{TS}$ (Fig. [11]A). Surprisingly, $P(t_{TS})$ is approximately 
uniform in the CS→OS transition (Fig. [11]). As a result, the TSE structures are much less 
heterogeneous in the forward than in the backward direction (Fig. [11]C). However, the spread 
in $t_{TS}$ is broader in the forward direction compared to the backward direction. From the TSE 
we can compute a Tanford β-like parameter, $q^\ddagger$ ($0 < q^\ddagger < 1$), using 

$$q^\ddagger = \frac{\max(\Delta^\ddagger) - \Delta^\ddagger}{\max(\Delta^\ddagger) - \min(\Delta^\ddagger)} \qquad (2)$$

where $\Delta^\ddagger = (\Delta_G^{CS}(t_{TS}) + \Delta_G^{OS}(t_{TS}))/2$, and $\max(\Delta^\ddagger)$ and $\min(\Delta^\ddagger)$ are the maximum and minimum 
values of $\Delta^\ddagger$, respectively. If $q^\ddagger$ is close to 0 (1) then the most probable TS is close to the starting (ending) 
allosteric state. For the CS→OS transition the average value of $q^\ddagger$ is 0.66 (Fig. [11]B), which 
implies that the TSE structures are more OS-like (see the average TS structures in Figs. [5] and 
[9]). Although the distribution $P(q^\ddagger)$ for the OS→CS transition is very broad (Fig. [11]D), the most probable 
$q^\ddagger$ is closer to the CS than to the OS. Thus, in both the transitions the average TSE structures 
resemble the high energy allosteric states. This observation supports the recent inferences drawn 
from the NMR relaxation time measurements [2] that the high energy conformation is populated 
(in the pre-equilibrium sense) in both the allosteric states. From the Hammond postulate 
it follows that the TSE structures should resemble the high free energy states, in accord with 
the present simulation results. Surprisingly, we further predict that the TSE structures for the 
OS→CS transition are conformationally much more heterogeneous than in the forward direction 
(compare Figs. [11]B and [11]D). 
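
As an illustration only (not the analysis code used in this work), the transition state criterion and the Tanford β-like parameter can be sketched in Python. The function names and the toy RMSD values are hypothetical, and the sketch assumes that the global RMSD with respect to the CS and the OS has already been computed for every frame of a trajectory.

```python
import numpy as np

def first_ts_frame(rmsd_cs, rmsd_os, eps=0.5):
    """Index of the first frame with |RMSD to CS - RMSD to OS| < eps, else None."""
    gap = np.abs(np.asarray(rmsd_cs, float) - np.asarray(rmsd_os, float))
    hits = np.flatnonzero(gap < eps)
    return int(hits[0]) if hits.size else None

def q_dagger(delta_ts):
    """(max - value) / (max - min) over the TS RMSD values, one per trajectory."""
    delta_ts = np.asarray(delta_ts, float)
    span = delta_ts.max() - delta_ts.min()
    if span == 0.0:
        raise ValueError("all transition states have the same RMSD")
    return (delta_ts.max() - delta_ts) / span

# Hypothetical RMSD traces (in angstroms) for a single CS -> OS trajectory.
rmsd_cs = [0.4, 1.0, 1.9, 3.0]
rmsd_os = [3.1, 2.2, 1.6, 0.5]
frame = first_ts_frame(rmsd_cs, rmsd_os)
# Average of the two RMSD values at the transition state frame.
delta_ts = 0.5 * (rmsd_cs[frame] + rmsd_os[frame])
```

In practice one value of the TS RMSD would be collected per trajectory, and `q_dagger` would be applied to the resulting array to build the distribution of the parameter.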

A small fraction of OS (CS) is present under equilibrium conditions in the CS 
(OS) state 

Although the dominant basin of attraction corresponds to a unique native folded state, enzymes 
can sample other conformations, albeit infrequently, through thermal fluctuations. 
Some of the conformations that are sampled in the ensemble of the equilibrated CS can correspond 
to the structures in the OS [2]. The allosteric mechanism based on the preexisting equilibrium 
[48] is qualitatively different from the induced-fit model [49], which posits that the conformational 
transitions in the CS state occur only after the ligand binds. Indeed, several experiments, 
including the recent reports on DHFR [2], suggest that a small population of OS conformations 
is in equilibrium with the CS structures. Similarly, we expect that CS structures should be 
accessible when the molecule is predominantly in the OS. 

In order to probe the validity of the conformational selection model [48] we calculated the 
distribution $P(\Delta\Delta_G)$ where 

$$\Delta\Delta_G = \Delta_G^{OS} - \Delta_G^{CS} \qquad (3)$$

where $\Delta_G^{OS}$ is the equilibrium RMSD of conformations in the CS with respect to the OS structure, 
and $\Delta_G^{CS}$ is the corresponding RMSD with respect to the CS structure. If DHFR is in the CS 
without ever sampling the OS-like structures then we expect that $\Delta_G^{CS} \approx 0$. As a result, $P(\Delta\Delta_G)$ 
should be identically zero whenever $\Delta\Delta_G < 0$. Thus, the observation of negative values of $\Delta\Delta_G$ 
is an indication of preexisting OS-like structures even under equilibrium conditions that favor 
CS (Fig. [12]A). Figure [12]A (Fig. [12]B) shows that $P(\Delta\Delta_G)$ is non-zero for a small range of 
negative (positive) $\Delta\Delta_G$ in the ensemble of CS (OS). The population of the minor species CS 
(OS) in the OS (CS) basin is ~3% (~1%). Surprisingly, these estimates are similar to the 
values reported by Boehr et al. [2]. The presence of higher energy species also suggests, in 
accord with the Hammond postulate, that the TS structure should be OS-like in the CS→OS 
transition. This inference, which follows from the conformational selection model, is in accord 
with our simulations. We predict that mutations that destabilize either the CS or the OS will 
affect the kinetics of the allosteric transitions. 
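
The estimate of the minor population can be illustrated with a short Python sketch. It assumes that, for every equilibrium conformation, the RMSD with respect to the OS and CS structures is available; the function name and the numbers below are hypothetical and serve only to show the sign convention of the RMSD difference.

```python
import numpy as np

def minor_population(rmsd_to_os, rmsd_to_cs, basin):
    """Fraction of frames in a basin ensemble that are closer to the other state.

    The difference is RMSD(to OS) - RMSD(to CS): negative values are OS-like,
    positive values are CS-like.
    """
    ddg = np.asarray(rmsd_to_os, float) - np.asarray(rmsd_to_cs, float)
    if basin == "CS":
        return float(np.mean(ddg < 0.0))   # OS-like frames in the CS ensemble
    if basin == "OS":
        return float(np.mean(ddg > 0.0))   # CS-like frames in the OS ensemble
    raise ValueError("basin must be 'CS' or 'OS'")

# Hypothetical four-frame ensembles (RMSD in angstroms).
frac_os_like = minor_population([3.0, 3.0, 1.0, 3.0], [1.0, 1.0, 2.0, 1.0], "CS")
frac_cs_like = minor_population([1.0, 1.0, 1.0, 3.0], [3.0, 3.0, 3.0, 1.0], "OS")
```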
Concluding Remarks 

From the perspective of allostery, it is not surprising that communication between residues 
that are spatially well separated facilitates the CS→OS transition [50]. We have used a sequence-based 
method to identify a network of mechanically important residues that could control the 
kinetics of conformational transitions. The residues in the network are dispersed both in the 
adenosine binding and the loop domains. Coordinated motions among these residues and others 
control the structural transitions, and perhaps the forward and backward hydride transfer reaction. 
Surprisingly, several of these residues are not strongly conserved although their chemical 
character is often preserved across the various species. Hydride transfer experiments on the 
wild type DHFR and its mutants Q, Q have already pointed out the importance of many of 
the residues in the network. In addition, all atom molecular dynamics simulations [6, 13] and 
NMR experiments [1, 12] have implicated the key role of the network residues in the dynamics 
of DHFR. Although it is difficult to unambiguously establish a direct link between the DHFR 
motions (equilibrium or dynamic) and hydride transfer reaction [3], the perturbation of these 
residues will affect magnetic resonance relaxation dispersion. 

The kinetics of the allosteric transitions in the forward (CS→OS) and the reverse (OS→CS) directions, 
obtained using the SOP model, reveal in great detail the order of events that results in the movement 
of the Met20 loop. In the forward direction, several contacts in the CS state rupture and new 
ones form in the OS. The concerted kinetics associated with these contacts, most of which are 
associated with the Met20, βF-βG, and βG-βH loops, facilitate the sliding motion of the Met20 
loop so that it occludes the active site (Fig. [[]). Surprisingly, the pathways in the OS→CS 
transition are not the reverse of the forward reaction. In particular, the interactions between 
Glu17-Ser49, whose rupture facilitates the sliding of the Met20 loop back to its CS position, persist 
until late in the OS→CS transition. In the forward direction, the Glu17-Ser49 contact occurs late for 
the sliding motion of Met20 along α2 to take place. The broad transition state region, both in 
the forward and backward directions, attests to the inherent plasticity of enzymes in general, 
and DHFR in particular. These results support the notion that mutations that inhibit the 
equilibrium fluctuations leading to the population of the minor species can adversely affect the 
rates of hydride transfer reaction. Indeed, the observed decrease in the hydride transfer rate in 
G121V has been rationalized using this picture. 

Of particular importance is the link between the present studies and the recent NMR relaxation 
measurements by Boehr et al. [2], which showed that, at equilibrium, there is a small 
percentage of OS structures in the ensemble of CS conformations. Similarly, when DHFR is 
in the OS state dynamical fluctuations populate a small fraction (~(1-3)%) of CS structures. Our 
simulation results are in accord with the NMR experiments [2]. These results support the emerging 
notion that in enzymes conformations resembling the cofactor-bound structure are already 
present in the apoenzyme. The cofactors dynamically funnel the minor populations so that the 
equilibrium shifts to the holoenzyme. The present simulations show that such conformational 
fluctuations occur on the μs time scale. Because of the simplicity of the SOP model the estimated 
time scale should be taken as a lower bound. The ability to access the higher free energy states 
on the (μs-ms) time scale is a consequence of the conformational heterogeneity of the enzyme, which 
leads to low barriers separating the relevant kinetic states. In DHFR, this is reflected in the 
broad TSE with heterogeneous structures that results in a broad distribution of crossing times 
between the allosteric states. 

We also obtained the temperature ($T$) dependence of the rates of the forward and reverse 
transition for the WT and the forward transition for the CL. The rates were computed by fitting 
the time dependence of $\langle \Delta_G(t) \rangle$ for $T$ in the range 285 K < $T$ < …. The averaging is 
performed using 20 trajectories. We find that the three rates follow the Arrhenius behavior. 
Because the SOP is a coarse-grained model the activation barrier is severely underestimated. 
Nevertheless, the results show that the rates of the allosteric transitions are enhanced as $T$ 
increases. Needless to say, altering $T$ might also change the reorganization free energies of 
the solvent, which could be a dominant factor in determining the catalytic rates [13, 26, 51]. 
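
A minimal sketch of an Arrhenius analysis is given below for illustration. It assumes that a rate has already been extracted at each temperature; the temperatures, prefactor, and barrier are synthetic placeholders (they are not results of this work), and the function name is hypothetical.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def arrhenius_fit(temperatures, rates):
    """Fit ln(rate) = ln(A) - Ea / (k_B T); return (Ea in kcal/mol, A)."""
    inv_t = 1.0 / np.asarray(temperatures, float)
    slope, intercept = np.polyfit(inv_t, np.log(np.asarray(rates, float)), 1)
    return float(-slope * K_B), float(np.exp(intercept))

# Synthetic data generated from an assumed barrier and prefactor.
temps = np.array([285.0, 295.0, 305.0, 315.0])   # K, hypothetical grid
true_ea, true_a = 2.0, 5.0                       # placeholders
rates = true_a * np.exp(-true_ea / (K_B * temps))
ea, prefactor = arrhenius_fit(temps, rates)
```

A straight line in the plot of the logarithm of the rate against the inverse temperature is the signature of Arrhenius behavior; the slope gives the apparent activation barrier.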

The predicted temperature dependence of the CS→OS transition might provide a way to 
test the extent to which correlated enzyme dynamics is important in catalysis. If the rates of 
conformational fluctuations are drastically different from the rate of the chemical step then it is 
unlikely that the correlated motions of the enzyme are crucial to catalysis. On the other hand, if 
the two rates are comparable then the conformational changes during the CS→OS transition 
might be coupled to hydride transfer. In general, it is important to formulate a method that 
couples the dynamics of the CS→OS transition (including the cofactors) and the chemical step. 
Such a formalism must account for both the kinetics of transitions along the lines described here 
and the hydride transfer processes described by others [g, 26]. 

The SOP model, which was introduced to carry out simulations of large systems [53], 
does not include a number of relevant interactions. Most notably, the lack of an explicit model for 
hydrogen bonds prevents us from examining their role in the allosteric transitions. The role the 
network of hydrogen bonds of DHFR plays in affecting the CS→OS transition can only be vicariously 
gleaned using the SOP model. On the other hand, the major advantage of the SOP model 
is that long time simulations for a large number of trajectories can be carried out. Indeed, the 
non-trivial prediction that the coordinated motions of specific residues throughout the structure 
trigger the movement of the Met20 loop is amenable to experimental tests. The non-conserved 
residues identified in this work can form the basis of future mutagenesis experiments. We believe 
that a combination of computational methods (sequence-based technique, coarse-grained and 
all atom MD simulations), and NMR, single molecule, and biochemical experiments is needed 
to fully dissect the interplay between enzyme motion and catalysis. 

Methods 

Statistical coupling analysis: 

In order to identify the residues that are involved in transmitting allosteric signals, we use 
our formulation [33] of the Sequence-based Statistical Coupling Analysis (SCA) introduced by 
Lockless and Ranganathan [34, 35, 36]. A statistical free energy-like function at each position, 
$i$, in a multiple sequence alignment (MSA) is defined as 

$$\frac{\Delta G_i}{k_B T^*} = \sqrt{\frac{1}{C_i} \sum_{x=1}^{20} \left[ p_i^x \ln\left(\frac{p_i^x}{p_x}\right) \right]^2} \qquad (4)$$

where $k_B T^*$ is an arbitrary energy unit, $C_i$ is the number of types of amino acid that appear 
at position $i$, and $p_x$ is the mean frequency of amino acid $x$ in the MSA. In Eq. [4], $p_i^x = n_i^x/N_i$, where 
$n_i^x$ is the number of times amino acid $x$ appears at position $i$ in the MSA, and $N_i = \sum_{x=1}^{20} n_i^x$. 
The basic hypothesis of the SCA is that correlation or covariation between two positions $i$ 
and $j$ may be inferred by comparing the statistical properties of the MSA and a subalignment of 
sequences (derived from the MSA) in which a given amino acid is conserved ($S_j = 0$) at $j$. The 
restriction that $S_j = 0$ in the subalignment is referred to as sequence perturbation at position 
$j$. The effect of perturbation is assessed using 

$$\frac{\Delta\Delta G_{ij}}{k_B T^*} = \sqrt{\frac{1}{C_i} \sum_{x=1}^{20} \left[ p_{i,R}^x \ln\left(\frac{p_{i,R}^x}{p_x}\right) - p_i^x \ln\left(\frac{p_i^x}{p_x}\right) \right]^2} \qquad (5)$$

where $p_{i,R}^x = n_{i,R}^x/N_{i,R}$, $n_{i,R}^x$ is the number of sequences in the subalignment in which 
$x$ appears in the $i$th position, and $N_{i,R} = \sum_{x=1}^{20} n_{i,R}^x$. 
In order to obtain statistically meaningful results using the SCA, it is crucial to choose 
the subalignments appropriately. Let $f = p/N_{MSA}$ where $p$ is the number of sequences in the 
subalignment and $N_{MSA}$ is the total number of sequences in the MSA. We choose $f$ (= 0.35 for 
the DHFR family) to satisfy the central limit theorem [33] so that the statistical properties from 
the subalignments coincide with the full MSA. Using $f = 0.35$, we calculated the matrix elements 
$\Delta\Delta G_{ij}$ which estimate the response of position $i$ in the MSA to all allowed perturbations at $j$ 
($S_j = 0$). The rows (labeled $i$) in $\Delta\Delta G_{ij}$ correspond to positions in the MSA. We determined 
the network of covarying residues using the elements $\Delta\Delta G_{ij}$ in conjunction with the coupled 
two-way clustering algorithm [54]. The extent to which the rows $\Delta\Delta G_{ij}$ and $\Delta\Delta G_{kj}$ are similar is 
assessed using the Euclidean measure [33]. Because $\Delta\Delta G_{ij} = 0$ for perfectly conserved positions 
and for sites where the amino acids are found at their mean frequencies in the MSA ($p_i^x = p_x$), 
the SCA cannot predict the role these residues play in the function or dynamics of the enzyme. 
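
The SCA quantities can be illustrated with a short Python sketch. This is not the implementation used in this work: it assumes that both the conservation and the coupling scores are root-mean-square sums of $p \ln(p/p_x)$ terms normalized by $C_i$, it ignores the requirement that subalignments contain a fraction $f$ of the sequences, and all function names and the four-sequence toy alignment are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_frequencies(seqs, i):
    """Frequency of each of the 20 amino acids at column i (gaps ignored)."""
    counts = np.array([sum(s[i] == a for s in seqs) for a in AMINO_ACIDS], float)
    total = counts.sum()
    return counts / total if total > 0 else counts

def mean_frequencies(seqs):
    """Mean frequency of each amino acid over the whole alignment."""
    counts = np.array([sum(s.count(a) for s in seqs) for a in AMINO_ACIDS], float)
    return counts / counts.sum()

def p_log_ratio(p, p_mean):
    """p * ln(p / p_mean), with the convention 0 * ln(0) = 0."""
    out = np.zeros_like(p)
    ok = (p > 0) & (p_mean > 0)
    out[ok] = p[ok] * np.log(p[ok] / p_mean[ok])
    return out

def sca_scores(seqs, i, j, residue_at_j):
    """Return (dG_i, ddG_ij) in units of k_B T* for one perturbation at j."""
    p_mean = mean_frequencies(seqs)
    p_i = column_frequencies(seqs, i)
    # Perturbation: keep only sequences with the chosen residue at column j.
    sub = [s for s in seqs if s[j] == residue_at_j]
    p_ir = column_frequencies(sub, i)
    c_i = max(int(np.count_nonzero(p_i)), 1)
    base = p_log_ratio(p_i, p_mean)
    pert = p_log_ratio(p_ir, p_mean)
    dg = float(np.sqrt(np.sum(base ** 2) / c_i))
    ddg = float(np.sqrt(np.sum((pert - base) ** 2) / c_i))
    return dg, ddg

msa = ["AC", "AC", "GD", "GD"]          # toy alignment, two columns
dg, ddg = sca_scores(msa, i=0, j=1, residue_at_j="C")
```

For a perfectly conserved column the subalignment has the same frequencies as the full alignment, so the coupling score vanishes, which is the limitation noted above.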

Self-organized polymer (SOP) model for closed (CS) and occluded (OS) states 

We have carried out Brownian dynamics simulations to obtain the kinetics of transitions 
between the allosteric states. Because of the long time scales involved in these transitions, we use 
a coarse-grained self-organized polymer (SOP) model for DHFR in the CS and OS states. The 
validity of the SOP model has been established in the context of mechanical unfolding of RNA 
and proteins, allostery in GroEL [53], and conformational changes in kinesin [55]. In the 
SOP model, the structure of a protein is represented using only the $C_\alpha$ coordinates. Because 
we are interested in the kinetics of the CS→OS and OS→CS transitions, the energy functions are 
state dependent. The state-dependent energy function, in the SOP representation of protein 
structures in terms of the $C_\alpha$ coordinates $r_i$ ($i = 1, 2, \ldots, N$, with $N$ being the number of amino 
acids), is 

$$H(\{r_i\}|X) = V_{FENE} + V_{NB}^{ATT} + V_{NB}^{REP} \qquad (6)$$

where the label $X$ refers to the allosteric state CS or OS. In Eq. [6], $r_{i,i+1}$ is the distance between 
two consecutive $C_\alpha$ atoms, $r_{ij}$ is the distance between the $i$th and $j$th $C_\alpha$ atoms, and 
the superscript 0 denotes their values in the crystal structure of the allosteric state $X$. The 
first term in Eq. [6], the finite extensible non-linear elastic (FENE) potential, accounts for chain 
connectivity. The stability of the state $X$ is described by the non-bonded interactions (second 
term in Eq. [6]) that assign attractions between residues that are in contact in $X$. Non-bonded 
interactions between residues that are not in contact in $X$ are taken to be purely repulsive (third 
term in Eq. [6]). The value of $\Delta_{ij}$ is 1 if $i$ and $j$ are in native contact, and is zero otherwise. 

In the SOP model, there are only two independent parameters. The results are insensitive 
to the precise values of $k$, $R_0$, and $\epsilon_l$. The two key parameters are $\epsilon_h$, a single energy scale that 
describes the stability of state $X$, and the cutoff distance $R_C$ for native contacts. Because $R_C$ 
is, to a large extent, determined by the contact map, there is very little freedom in its choice. 
We assume that a native contact exists if the distance between the $i$th and $j$th $C_\alpha$ atoms is less 
than 8 Å. The spring constant $k$ in the FENE potential (first term in Eq. [6]) for stretching the 
covalent bond is 20 kcal/(mol Å$^2$), and the value of $R_0$, which gives the allowed extension of the 
covalent bond, is 2 Å. 
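
For illustration, a state-dependent energy of this type can be sketched in Python. The explicit form of each term is not spelled out in this section, so the sketch assumes functional forms commonly used for SOP-type models: a FENE bond term, a 12-6 attraction for native contacts, and an inverse sixth-power repulsion for non-native pairs, with non-bonded pairs restricted to residues at least three apart along the chain. The values of `eps_h`, `eps_l`, and `sigma` are placeholders, and the five-bead structure is a toy example rather than DHFR.

```python
import numpy as np

def pair_distances(coords):
    """All pairwise distances between C-alpha positions, shape (N, N)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def sop_energy(r, r0, k_bond=20.0, r_max=2.0, eps_h=2.0, eps_l=1.0,
               sigma=3.8, r_cut=8.0):
    """Energy (kcal/mol) of conformation r for the state with crystal coordinates r0."""
    r, r0 = np.asarray(r, float), np.asarray(r0, float)
    d, d0 = pair_distances(r), pair_distances(r0)
    n = len(r)
    # FENE term for consecutive beads (assumed form).
    bonds = np.arange(n - 1)
    stretch = d[bonds, bonds + 1] - d0[bonds, bonds + 1]
    if np.any(np.abs(stretch) >= r_max):
        return float("inf")  # bond extended beyond the allowed range
    v_fene = -0.5 * k_bond * r_max**2 * np.sum(np.log(1.0 - (stretch / r_max) ** 2))
    # Non-bonded pairs separated by at least three residues (assumption).
    i, j = np.triu_indices(n, k=3)
    native = d0[i, j] < r_cut  # contact map from the crystal structure
    ratio = d0[i, j][native] / d[i, j][native]
    v_att = eps_h * np.sum(ratio**12 - 2.0 * ratio**6)
    v_rep = eps_l * np.sum((sigma / d[i, j][~native]) ** 6)
    return float(v_fene + v_att + v_rep)

# Toy five-bead "crystal structure" with 3.8 angstrom bonds.
crystal = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0],
                    [0.0, 3.8, 0.0], [0.0, 3.8, 3.8]])
e_native = sop_energy(crystal, crystal)
```

In this toy structure all three non-bonded pairs are native contacts, so the energy at the crystal geometry equals minus three times `eps_h`. Evaluating the same conformation against the CS and the OS reference coordinates gives the two state-dependent energies.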

Brownian dynamics simulations of allosteric transitions 

The kinetics of forward and backward transitions between CS and OS are probed using a 
method that was recently used to study allosteric transitions in GroEL [53]. The basic assumption 
of the method is that the local strain that DHFR experiences (due to ligand or cofactor 
binding) propagates faster across the structure than the conformational transitions that lead to 
the CS→OS transition and vice versa. In other words, the global structural relaxation is the slow step in 
the allosteric transition. 

Using the SOP model, we simulated the transition between the CS (PDB code 1RX2) and 
OS (PDB code 1RX7) states by assuming that the dynamics of the protein can be adequately 
described by the Langevin equation in the overdamped limit. Transitions from one allosteric 
state (CS) to another (OS) are induced by starting from a conformation corresponding to the 
CS. The transition is induced by using the forces computed from the OS, i.e. from $H(\{r_i\}|OS)$. 
The explicit equations of motion for CS→OS are 

$$\zeta \frac{\partial r_i}{\partial t} = -\frac{\partial H(\{r_i\}|OS)}{\partial r_i} + F(t) \qquad (7)$$

where $H(\{r_i\}|OS)$ is the Hamiltonian of the OS state, $\zeta$ is the friction coefficient, $F(t)$ is the 
random force, and $r_i$ is the position of the $i$th residue at time $t$. The initial ($t = 0$) value of 
$r_i$ is taken from the Boltzmann distribution at temperature $T$ corresponding to the CS state, 

$$P(\{r_i(0)\}) \sim e^{-\beta H(\{r_i\}|CS)} \qquad (8)$$

where $\beta = 1/(k_B T)$, and $k_B$ is the Boltzmann constant. The random force $F(t)$ satisfies 

$$\langle F(t) \rangle = 0 \qquad (9)$$

and 

$$\langle F(t) F(t') \rangle = 2 k_B T \zeta \, \delta(t - t') \qquad (10)$$

where the averages are over the trajectories. As long as potential conditions are satisfied [56], 
our method for inducing transitions ensures that, at long times, DHFR will explore the conformations 
corresponding to the OS state. At long times, the ensemble of conformations obeys the 
Boltzmann distribution corresponding to the OS state so that $P(\{r_i\}|OS) \sim e^{-\beta H(\{r_i\}|OS)}$. Thus, 
on general theoretical grounds our procedure guarantees that the CS→OS transition can be realized, 
and that the dynamics represents the microscopic events that drive the transition of interest. 
To monitor the reverse reaction, we begin with an initial equilibrated ensemble of structures 
corresponding to the OS state and integrate the equations of motion (Eq. [7]) with the forces 
arising from the CS state. 
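
The switching protocol can be illustrated with a minimal Python sketch that integrates an overdamped Langevin equation with a simple Euler scheme. It is not the production code: the one-dimensional harmonic "OS" potential, the parameter values, and the function names are assumptions made only to show how a conformation started in one state relaxes under the forces of the other.

```python
import numpy as np

def brownian_step(r, force, h, zeta, kT, rng):
    """One Euler step of zeta * dr/dt = force(r) + random force."""
    # Displacement noise with variance 2 kT h / zeta, consistent with
    # <F(t)F(t')> = 2 kT zeta delta(t - t').
    noise = np.sqrt(2.0 * kT * h / zeta) * rng.standard_normal(r.shape)
    return r + (h / zeta) * force(r) + noise

def run_switch(r_start, force_target, n_steps, h, zeta, kT, seed=0):
    """Propagate a conformation from the starting state using target-state forces."""
    rng = np.random.default_rng(seed)
    r = np.array(r_start, dtype=float)
    for _ in range(n_steps):
        r = brownian_step(r, force_target, h, zeta, kT, rng)
    return r

# Toy model: the "CS" minimum sits at 0 and the "OS" minimum at 5 (arbitrary units).
CS_CENTER, OS_CENTER, SPRING = 0.0, 5.0, 1.0

def force_os(r):
    """Force derived from a harmonic stand-in for the OS Hamiltonian."""
    return -SPRING * (r - OS_CENTER)

relaxed = run_switch([CS_CENTER], force_os, n_steps=200, h=0.1, zeta=1.0, kT=0.0)
noisy = run_switch([CS_CENTER], force_os, n_steps=200, h=0.1, zeta=1.0, kT=0.6)
```

With the noise switched off the coordinate relaxes deterministically to the OS minimum; at finite temperature it fluctuates around that minimum. The reverse reaction corresponds to starting near the OS minimum and using the forces of the CS.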

The procedure used to induce the allosteric transition hinges on the physically reasonable 
assumption that the time scales involved in the conformational changes in the enzymes are much 
longer than the time needed for locally-induced strain (due to cofactor binding for example) to 
propagate through the structure. In order to test this assumption, we varied the rate at which 
the CS→OS transition is allowed to take place by allowing $r_{ij}^0(\mathrm{CS}\to\mathrm{OS})$ to evolve slowly on a 
relatively long time scale. We accomplish the slower evolution using 

$$r_{ij}^0(\mathrm{CS}\to\mathrm{OS}) = \frac{(K - k)\, r_{ij}^0(\mathrm{CS}) + k\, r_{ij}^0(\mathrm{OS})}{K} \qquad (11)$$

The majority of the results were obtained using $K = k = 1$. To vary the strain propagation 
time we also performed simulations using $K = 100$ and increased $k$ in steps. The CS→OS 
conformational switch was made smoothly over a range of times, namely, 0.12 μs, 0.16 μs, 0.2 μs, 
and 0.24 μs. The kinetics of $\langle \Delta_G(t) \rangle$ for the CS→OS transition, which reports on the time 
dependence of the global RMSD changes, for the four cases coincide with the results in Fig. [5] 
and M (data not shown). These additional simulations justify the assumption underlying our 
procedure for inducing the conformational transition. 
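
The gradual switch of the reference distances is a linear interpolation, which can be sketched as follows. The function name and the two toy distances are hypothetical.

```python
import numpy as np

def switched_reference(r0_cs, r0_os, k, K):
    """Reference distances at switching step k out of K (k = 0 is CS, k = K is OS)."""
    if not 0 <= k <= K:
        raise ValueError("k must lie between 0 and K")
    r0_cs, r0_os = np.asarray(r0_cs, float), np.asarray(r0_os, float)
    return ((K - k) * r0_cs + k * r0_os) / K

# Two hypothetical native distances (angstroms) in each allosteric state.
r0_cs = [4.0, 6.0]
r0_os = [8.0, 2.0]
halfway = switched_reference(r0_cs, r0_os, k=50, K=100)
instant = switched_reference(r0_cs, r0_os, k=1, K=1)
```

Setting $K = k = 1$ switches the reference distances to their OS values in a single step, while $K = 100$ with $k$ increased in steps spreads the switch over a longer time.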

Time scales and simulation details: 

In order to decipher the events that drive the conformational changes in the Met20 loop 
during the catalytic cycle, we performed three different simulations. We first simulated the 
kinetics of CS→OS transitions to determine the order of events during the forward reaction. In 
order to assess the roles of the residues G121 and M16 (whose $C_\alpha$ atoms are about 
4.3 Å apart in the CS state) in the CS→OS transition, we replaced the M16-G121 interaction by a 
disulfide bond. Such a cross link, referred to as CL, can be made by mutating these two residues 
to cysteines. The disulfide bond is simulated by adding a FENE potential between M16 and G121 
(see Eq. [6]). The CL mutant allows us to assess the effect of the coupling between two distal residues, one in 
the Met20 loop and the other in the βF-βG loop, on the CS→OS transition. Experiments by 
Benkovic and coworkers have established that mutation of G121 can affect the hydride transfer 
reaction [4]. Finally, we also simulated the OS→CS transition which occurs during the release 
of THF. 

We used the Brownian dynamics algorithm, for which the characteristic time scale is $\tau_H$ = …. 
A typical value for $\tau_L$ for proteins is 3 ps jszj]. The simulations were performed with $\zeta = 100\tau_L^{-1}$, 
which corresponds to water viscosity. The typical value of the integration step size is $h = 0.16\tau_L$. 
During the transition from one allosteric state to another, we reduced $h$ tenfold to maintain 
numerical stability. We first equilibrate DHFR in the CS at $T$ = 300 K for 80 μs. Subsequently, 
the forces are computed using the OS state Hamiltonian. The equations of motion are integrated 
for long times (typically exceeding 700 μs) so that the transition to the OS state is complete. 
A similar procedure is used for the reverse reaction and the cysteine crosslink mutants. We 
generated 50 trajectories for each conformational transition so that the reported results for the 
kinetics are statistically significant. 

Figure Captions 

Figure [1]: Catalytic cycle and structures of closed and occluded states. (A) Scheme of the 
catalytic cycle of DHFR that shows the two key conformations adopted by the enzyme. The 
Met20 loop closes the active site in the E:NADPH:DHF complex, while it is occluded in the 
E:NADP+:THF complex. (B) Structure of the closed state (CS) (PDB code 1RX2) is on the left 
and the occluded state (OS) (PDB code 1RX7) is on the right. For clarity, we have explicitly 
labeled the structural elements that facilitate the allosteric transitions. The major changes are 
localized in the Met20 loop. 

Figure [2]: Sequence conservation with respect to the structure of the CS. (A) In the CS 
structure on the left we have color coded the backbone to reflect the extent of sequence conservation. 
Red color represents strong conservation ($S(i) < 0.1$) and non-conserved residues are in 
blue. The residues that are clustered in the position in the SCA (V13, I14, N18, M20, D27, L28, 
K38, V40, I41, M42, W47, I50, G51, N59, I60, L62, Q65, I94, V99, Y111 and T113) are shown 
in the middle structure. The colors of the residues indicate the values of $S_{CSE}(i)$, with red 
representing strong conservation of the chemical identity. The right structure shows the network 
of residues that appear as perturbation in the SCA (D11, R12, V13, D27, A29, K38, V40, W47, 
I50, G51, P53, L62, V72, S77, A84, G97, V99, Q108, and Y111). In the three structures the 
cofactors (NADPH and DHF) are shown using all-atom representation. (B) The top panel shows 
position dependent sequence entropy ($S(i)$) obtained by aligning the E. coli DHFR against the 
rest of the 426 sequences. Strong conservation ($S(i) < 0.1$) is observed only for a small fraction 
of residues. The chemical sequence entropy, $S_{CSE}(i)$ in the bottom panel, shows that for a 
substantial fraction of residues only the chemical identity changes. These residues are dispersed 
throughout the structure. 

Figure [3]: Correlated and anti-correlated motions in DHFR. Covariance matrix of equilibrium 
fluctuations of the unit vectors constructed from the coordinates of the $C_\alpha$ atoms (Eq. 
[1]) of the wild type DHFR in the CS. The residues associated with the structural elements are 
shown on the left. The scale on the right measures the extent of correlation. In the bottom 
panel we show $|C_{ij}^{CS} - C_{ij}^{OS}|$ using the same scale. For clarity, we only highlight those regions 
that are different in the two allosteric states. The simplicity of the SOP model has allowed us 
to probe the equilibrium fluctuations on about 0.1 msec. 

Figure [4]: RMSD as a function of time during the CS→OS transition. Time dependent 
changes in the global RMSD ($\Delta_G(t)$) for a few representative trajectories as a function of $t$ 
are given in the top panel. The dynamical changes in $\Delta_G(t)$ for one of the trajectories for the 
time interval enclosed by the box are shown below. The arrows show that frequent recrossings 
between CS and OS states occur prior to the completion of the CS→OS transition. 

Figure [5]: Kinetics of allosteric transitions probed using RMSD. Time dependent changes 
in the global, $\langle \Delta_G(t) \rangle$, and the local, $\langle \Delta_L(t) \rangle$, RMSD of the Met20 loop averaged over 
50 trajectories. The meaning of the symbols is given in the insets. The RMSDs are measured 
with respect to the starting and ending states. For example, $\langle \Delta_G(CS) \rangle$ means that the 
global RMSD is computed with respect to the CS. The changes in $\langle \Delta_G(t) \rangle$ and $\langle \Delta_L(t) \rangle$ 
for the WT CS→OS are shown in (A), and (B) shows the results for the CL. The structures on 
the right in both (A) and (B) represent superposition of the CS, OS and the average transition 
state (TS) conformations. Conformations of the Met20 loop in the CS (green), OS (blue), and 
TS (red) are highlighted. The cross link between 16 and 121 is explicitly shown in (B). 

Figure [6]: Kinetics of rupture of individual contacts between residues in the various 
substructures of DHFR during the CS→OS transition. (A) This panel shows the changes in the 
distance between the residues in the Met20 loop and α2 that rupture and form. (B) Same as 
(A) except the residues represent interactions between the Met20 loop and the βF-βG loop. (C) 
shows the kinetics describing the formation of interactions between the Met20 loop and the βG-βH 
loop. In all cases the identities of the residues are shown on the right. 

Figure [7]: Illustration of the sliding of the Met20 loop along α2. Two dimensional projection 
of the distance and angle ($\alpha_i$) that are kinetically sampled in the 50 trajectories during the 
CS→OS transition. The angles are defined in the text. The monitored residues are identified 
in the upper right corner. The colors grey, yellow, purple and cyan represent ensemble averages. 
The monotonic decrease in all the angles shows sliding of the Met20 loop along α2. Histograms 
of the distance and the angles for the residue pairs are displayed on the right and on top, 
respectively. 

Figure [8]: Structural representation of the coordinated changes in the distances of residues 
that accompany the sliding motion in the CS→OS (left side) and OS→CS (right side) transition. 
The arrows indicate the direction in which the structural changes occur. The displayed structural 
changes were inferred from the kinetics shown in Figs. [6], [10], and [3]. In the forward transition the 
Met20 loop is pulled by the βG-βH loop, which results in it being pushed away from the βF-βG loop. 
The push-pull process results in the sliding of the Met20 loop. The mechanism is approximately 
reversed in the OS→CS transition. 

Figure [9]: Changes in $\Delta_G(t)$ and $\Delta_L(t)$ for the OS→CS transitions. Same as Fig. [5] except 
that the transition is from OS→CS. The conformation of the Met20 loop in the TS (red) is 
different from that in Fig. [5]. 

Figure [10]: Dissection of the local changes in the kinetics in CL (CS→OS) and WT 
(OS→CS). (A) Time-dependent changes in the distances between select residues from the Met20 
loop and α1 for the CS→OS transition in the CL. (B) Same as 
(A) except these represent changes that occur during the OS→CS transition for the WT. 

Figure [11]: Characteristics of the transition state ensemble (TSE). (A) and (C) show the 
distribution of transition times $P(t_{TS})$ for the forward and reverse transition, respectively. (B) 
and (D) represent the distributions $P(q^\ddagger)$ of $q^\ddagger$ (see Eq. [2]) for the CS→OS and OS→CS 
transitions, respectively. In both cases the TSE is broad. However, the width of the TSE 
(inferred from $P(q^\ddagger)$) in the reverse direction is larger. The fluctuation in $q^\ddagger$ is 0.2 for 
CS→OS and 0.6 for OS→CS. 

Figure [12]: Sampling of OS (CS) in CS (OS) state. (A) Distribution $P(\Delta\Delta_G)$, calculated 
using an ensemble of equilibrated conformations in the CS, as a function of $\Delta\Delta_G$ (Eq. [3]). The 
negative regions represent sampling of conformations that resemble the OS. (B) Same as (A) 
except $P(\Delta\Delta_G)$ is obtained from an ensemble of equilibrated structures in the OS state. Under 
equilibrium conditions a minor population ~(1-3)% of the product-like structures is present. 
The displayed structure in (A) is OS-like while the one in (B) is CS-like. 